
fix(gguf): correct mismatched-shape error message in check_quantized_param_shape#13504

Open
Ricardo-M-L wants to merge 1 commit into huggingface:main from Ricardo-M-L:fix/gguf-shape-error-message

Conversation

@Ricardo-M-L
Contributor

What does this PR do?

Fixes the misleading error raised by GGUFQuantizer.check_quantized_param_shape when a loaded GGUF weight doesn't match the model's expected shape.

Before

inferred_shape = _quant_shape_from_byte_shape(loaded_param_shape, type_size, block_size)
if inferred_shape != current_param_shape:
    raise ValueError(
        f"{param_name} has an expected quantized shape of: {inferred_shape}, "
        f"but received shape: {loaded_param_shape}"
    )

The check compares inferred_shape against current_param_shape, but the message reports inferred_shape vs loaded_param_shape. Since inferred_shape is derived from loaded_param_shape, the two values on either side of the reported "mismatch" are effectively the same thing described at different unpacking stages — the shape the model actually expected (current_param_shape) never shows up in the message.

Concretely, the 9B Q8 GGUF failure noted in #13001 produced:

ValueError: double_stream_modulation_img.linear.weight has an expected quantized shape of: (24576, 4096), but received shape: torch.Size([24576, 8192])

…even though the model parameter was (36864, 6144), which is the real expected shape and the thing the user needs to see when diagnosing a Klein-vs-Dev/GGUF-variant mix-up.

After

<param_name> has an expected shape of: <current_param_shape>, but the loaded GGUF weight decodes to shape: <inferred_shape>

Now both sides of the comparison are visible, and the "expected" side actually reflects what the model wants.
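As a self-contained sketch of the corrected check (mirroring the Before snippet above, with the shape math inlined; the actual method is GGUFQuantizer.check_quantized_param_shape, so the free-function form here is an assumption):

```python
# Sketch only -- the argument names and inlined shape inference are
# assumptions; the real check lives on GGUFQuantizer in diffusers.
def check_quantized_param_shape(param_name, current_param_shape,
                                loaded_param_shape, type_size, block_size):
    # Decode the byte shape of the loaded GGUF tensor into logical elements.
    inferred_shape = (*loaded_param_shape[:-1],
                      loaded_param_shape[-1] // type_size * block_size)
    if inferred_shape != current_param_shape:
        raise ValueError(
            f"{param_name} has an expected shape of: {current_param_shape}, "
            f"but the loaded GGUF weight decodes to shape: {inferred_shape}"
        )
    return True
```

With this form, a mismatch reports the model's allocated shape on one side and the decoded GGUF shape on the other, which is exactly the pair a user needs to spot a wrong-variant checkpoint.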

Related

Partially addresses the error-message confusion noted by @Vargol in #13001 (comment). This PR only touches the error text — it does not change the detection logic or attempt to resolve the underlying Klein-vs-Dev GGUF shape-inference issue that @DN6 is tracking.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you write any new necessary tests? — N/A; this is a one-line error-message correction with no behavior change.

Who can review?

@DN6 @sayakpaul

@github-actions github-actions Bot added the quantization and size/S (PR with diff < 50 LOC) labels Apr 19, 2026
@sayakpaul sayakpaul requested a review from DN6 April 21, 2026 14:25
@sayakpaul
Member

@Ricardo-M-L I am seeing that you're opening a lot of PRs in a very short period of time. I politely ask you to reduce that pace a bit.

@Ricardo-M-L
Contributor Author

Friendly ping — this PR has been approved. Is there anything else needed before merging? Happy to make any requested changes.

check_quantized_param_shape compares inferred_shape against
current_param_shape, but the error message printed inferred_shape
vs loaded_param_shape — and inferred_shape is derived from
loaded_param_shape, so the reported mismatch was effectively
self-referential and gave no signal about the model's expected shape.

Print current_param_shape (what the model expected) vs inferred_shape
(what the quantized weight decodes to) so the two sides of the
comparison are actually visible.

Noted by @Vargol in huggingface#13001.
@Ricardo-M-L Ricardo-M-L force-pushed the fix/gguf-shape-error-message branch from be36bde to 587609f Compare April 27, 2026 14:51
@github-actions github-actions Bot added and removed the size/S (PR with diff < 50 LOC) label Apr 27, 2026
@Ricardo-M-L
Contributor Author

Thank you for the feedback, @sayakpaul! I understand the concern about PR volume. I will be more mindful and focus on higher-impact contributions going forward. Sorry for the inconvenience.
